Mixtral MOE 2x10.7B is a large language model built on a Mixture of Experts (MoE) architecture that combines two base models, Sakura-SOLAR-Instruct and CarbonVillain. The model performs strongly on text-generation tasks and has been evaluated on several public benchmarks, including the AI2 Reasoning Challenge (ARC), HellaSwag, and MMLU.
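A minimal usage sketch with the Transformers library is shown below. The repository id is a placeholder rather than the model's actual hub id, and the dtype, device mapping, and generation settings are illustrative assumptions, not recommended values.

```python
# Minimal text-generation sketch using Hugging Face Transformers.
# NOTE: "your-namespace/Mixtral-MOE-2x10.7B" is a placeholder repo id -- replace it
# with the actual model repository before running.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/Mixtral-MOE-2x10.7B"  # placeholder, assumed id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use (assumed setting)
    device_map="auto",          # spread layers across available devices
)

prompt = "Explain the Mixture of Experts architecture in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```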
Tags: Natural Language Processing, Transformers